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1. INTRODUCTION 

MINRES-QLP [Choi 2006; Choi et al. 2011] is a Krylov subspace method for comput- 
ing the minimum-length and minimum-residual solution (also known as the pseu- 
doinverse solution) x to the following linear systems or least-squares (LS) problems: 



where A is an n x n symmetric or Hcrmitian matrix and 6 is a real or complex 
n-vector. Problems (1) and (2) are treated as special cases of (3). The matrix A 
is usually large and sparse, and it may be singular. 1 It is defined by means of a 
user-written subroutine Aprod, whose function is to compute the product y = Av 
for any given vector v. 

Let Xk be the solution estimate associated with MINRES-QLP's fcth iteration, 
with residual vector r k = b — Ax k . Without loss of generality, we define xq = 0. 
MINRES-QLP provides recurrent estimates of ||x fe ||, ||r fe ||, ||Ar fe ||, \\A\\, cond(A), 
and H-AarfcH, which are used in the stopping conditions. 

Other iterative methods specialized for symmetric systems Ax = b are the 
conjugate-gradient method (CG) [Hestenes and Stiefel 1952], SYMMLQ and MIN- 
RES [Paige and Saunders 1975], and SQMR [Freund and Nachtigal 1994]. Each 
method requires one product Avk at each iteration for some vector v k . CG is 
intended for positive-definite A, whereas the other solvers allow A to be indefinite. 

If A is singular, SYMMLQ requires the system to be consistent, whereas MINRES 
returns an LS solution for (3) but generally not the min-length solution; see [Choi 
2006; Choi ct al. 2011] for examples. SQMR without preconditioning is mathe- 
matically equivalent to MINRES but could fail on a singular problem. To date, 
MINRES-QLP is probably the most suitable CG-type method for solving (3). 

In some cases the more established symmetric methods may still be preferable. 

(1) If A is positive definite, CG minimizes the energy norm of the error \\x — x k \\A 
in each Krylov subspace and requires slightly less work per iteration. However, 
CG, MINRES, and MINRES-QLP do reduce \\x - x k \\ A and \\x - x k \\ monoton- 
ically. Also, MINRES and MINRES-QLP often reduce ||r fc || to the desired level 
significantly sooner than does CG, and the backward error for each x k decreases 
monotonically. (See Section 2.4 and [Fong 2011; Fong and Saunders 2012].) 

(2) If A is indefinite but Ax — b is consistent (e.g., if A is nonsingular) , SYMMLQ 
requires slightly less work per iteration, and it reduces the error norm ||x — Xfc|| 
monotonically. MINRES and MINRES-QLP usually reduce ||x-x fe || [Fong 2011; 
Fong and Saunders 2012]. 

(3) If A is indefinite and well-conditioned and Ax = 6 is consistent, MINRES might 
be preferable to MINRES-QLP because it requires the same number of iterations 
but slightly less work per iteration. 

1 A further input parameter a (a real shift parameter) causes MINRES-QLP to treat "A" as if 
it were A — al. For example, "singular A" really means that A — al is singular. 
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solve Ax = b, 
minimize ||x||2 s.t. Ax = b, 
minimize || x| 1 2 s.t. x E argmin \\Ax - b\\ 2 , 



(1) 
(2) 
(3) 
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(4) MINRES and MINRES-QLP require a preconditioner to be positive definite. 
SQMR might be preferred if A is indefinite and an effective indefinite precon- 
ditioner is available. 

MINRES-QLP has two phases. Iterations start in the MINRES phase and trans- 
fer to the MINRES-QLP phase when a subproblem (see (8) below) becomes ill- 
conditioned by a certain measure. If every subproblem is of full rank and well- 
conditioned, the problem can be solved entirely in the MINRES phase, where the 
cost per iteration is essentially the same as for MINRES. In the MINRES-QLP phase, 
one more work vector and 5n more multiplications are used per iteration. 

MINRES-QLP described here is implemented in FORTRAN 90 for real double- 
precision problems. It contains no machine-dependent constants and does not need 
to use features such as polymorphism from FORTRAN 2003 or 2008. It requires an 
auxiliary subroutine Aprod and, if a preconditioner is supplied, a second subroutine 
Msolve. Since FORTRAN 90 contains the intrinsic COMPLEX data type, our imple- 
mentation is also adapted for complex problems. Precision other than double can 
be handily obtained by supplying different values to the data attribute KIND. The 
program can be compiled with FORTRAN 90 and FORTRAN 95 compilers such as 
f 90, f 95, g95, and gf ortran. We also have a MATLAB implementation, which is 
capable of solving both real and complex problems readily. All versions are available 
for download at [SOL]. 

Table I lists the main notation used. 



Table I. Key notation. 




matrix or vector two-norm 




A 


A = A — crl (see also a below) 




cond(A) 


condition number of A with respect to two-norm = — 

1 mm A ■ 


e» 
I 


ith unit vector 

index of the last Lanczos iteration when ft+i = 


n 


order of A 




null(A) 


null space of A defined as {x £ R n | Ax = 0} 




range (A) 


column space of A defined as {Ax \ x 6 R n } 




T 


(right superscript to a vector or a matrix) transpose 




*t 


unique minimum-length least-squares solution of problem 


(3) 


K k (A,b) 


feth Krylov subspace defined as span{6, Ab, . . . , A k ~ 1 b} 




e 


machine precision 




a 


scalar shift to diagonal of A 





1.1 Least-Squares Methods 

Further existing methods that could be applied to (3) are CGLS and LSQR [Paige 
and Saunders 1982a; 1982b], LSMR [Fong and Saunders 2011], and GMRES [Saad 
and Schultz 1986], all of which reduce ||rfc|| monotonically. The first three methods 
would require two products Avk and Auk each iteration and would be generating 
points in less favorable subspaces. GMRES requires only products Avk and could 
use any nonsingular (possibly indefinite) preconditioner. It needs increasing storage 
and work each iteration, perhaps requiring restarts, but it could be more effective 
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Table II. Comparison of various least-squares solvers on n X n systems (3). Storage refers to 
memory required by working vectors in the solvers. Work counts number of floating-point multi- 
plications. On inconsistent systems, all solvers below except MINRES and GMRES with restart 
parameter m return the minimum-length LS solution (assuming no prcconditioner). 



Solver 


Storage 


Work per 


Products per 


Systems to Solve per 






Iteration 


Iteration 


Iteration with Preconditioner 


MINRES 


In 


9ra 


1 


1 


MINRES-QLP 


7n-8n 


9ra-14n 


1 


1 


GMRES (m) 


(m + 2)n 


(m + 3 + l/m)n 


1 


1 


CGLS 


An 


5n 


2 


2 


LSQR 


5n 


8n 


2 


2 


LSMR 


6n 


9n 


2 


2 



than MINRES or MINRES-QLP (and the other solvers) if few total iterations were 
required. Table II summarizes the computational requirements of each method. 

1.2 Regularization 

We do not discourage using CGLS, LSQR, or LSMR if the goal is to regularize an 
ill-posed problem using a small damping factor A > as follows: 



A 
XI 



(4) 



However, this approach destroys the original problem's symmetry. The normal 
equation of (4) is (A 2 + \ 2 I)x = Ab, which suggests that a diagonal shift to A may 
well serve the same purpose in some cases. For symmetric positive-definite A, A = 
A — a I with a < enjoys a smaller condition number. When A is indefinite, a good 
choice of a may not exist, for example, if the eigenvalues of A were symmetrically 
positioned around zero. When this symmetric form is applicable, it is convenient in 
MINRES and MINRES-QLP; see (3), (5), and (15). We also remark that MINRES and 
MINRES-QLP produce good estimates of the largest and smallest singular values of 
A (via diagonal values of Rk or in (7) and (11); see [Choi et al. 2011, Section 4]). 

Three other regularization tools in the literature (see [Golub and Van Loan 1996, 
Sections 12.1.1-12.1.3] and [Hansen 1998]) are LSQI, cross-validation, and L-curve. 
LSQI involves solving a nonlinear equation and is not immediately compatible with 
the Lanczos framework. Cross-validation takes one row out at a time and thus does 
not preserve symmetry. The L-curve approach for a CG-type method takes iteration 
k as the regularization parameter [Hansen 1998, Chapter 8] if both ||j-fc|| and ||xfc|| 
are monotonic. By design, ||rfc|| is monotonic in MINRES and MINRES-QLP, and so 
is \\xk\\ when A is positive definite [Fong 2011]. Otherwise, we prefer the condition 
L-curve approach in [Calvetti et al. 2000], which graphs cond(Tfc) against ||rfc||. Yet 

(2") 

another L-curve feasible in MINRES-QLP is ||x^,_ 2 || against ||rfc||, since the former 
is also monotonic (but available two iterations in lag); see Section 2.4. 

2. MATHEMATICAL BACKGROUND 

Notation and details of algorithmic development from [Choi 2006; Choi et al. 2011] 
are summarized here. 
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2.1 Lanczos Process 

MINRES and MINRES-QLP use the symmetric Lanczos process [Lanczos 1950] to 
reduce A to a tridiagonal form T k . The process is initialised with v = O,0i = \\b\\, 
and j3\Vi = b. After k steps of the tridiagonalization, we have produced 

Pk = Av k - av k , a k = v k p k , Pk+\v k +i =Pk~ a k v k - k v k -i, (5) 

where we choose (3 k > to give ||vfc|| = 1. Numerically, 

p k = Av k - av k - fihVk-i, c*k=v k p k , lik+iv k +i = p k - a k v k 

is slightly better than (5) [Paige 1976], but we can express (5) in matrix form: 

T k 



V k = [v t 



v k \ 



Av k = v^n, 



n = 



Pk+iei 



(6) 



where T k = tridiag(/3 i7 a i} ft+i), i = 1, . . . ,k. In exact arithmetic, the Lanczos 
vectors in the columns of V k are orthonormal, and the process stops with k = £ 
when = for some t < n, and then AVe = VpT^. The rank of Tg could be £ or 
I - 1 (see Theorem 2.2). 

2.2 MINRES Phase 

MINRES-QLP typically starts with a MINRES phase, which applies a series of re- 
flectors Q k to transform T k to an upper triangular matrix R k : 



Qk[Tk faei 



Rk t k 

4> k 



[Rk t k +i] , 



(7) 



where 



Qk = Qk,k + 1 



Qk- 



Q 



k,k+l = 



Ik- 



s k -Cfc 



In the kth step, Qk,k+i is effectively a Householder reflector of dimension 2 [Trc- 
fethen and Bau 1997, Exercise 10.4]; and its action including its effect on later 
columns of 7} , k < j < £, is compactly described by 



Cfc s k 
Sk -Cfc 



7fc 4+i 

fik+l Ckfc+i /3fc + 2 



4>k-i 










(2) 

Tk 




k+ l £fc+2 
7fc+l 4+2 



4>k 



where the superscripts with numbers in parentheses indicate the number of times 
the values have been modified. The kth solution approximation to (3) is then 
defined to be x k = V k y k , where y k solves the subproblem 



Vk = arg min \\T k y - /?iei|| = arg min \\R k y - t k +i\ 

y<£R k y£R k 



(8) 



When k < £, R k is nonsingular and the unique solution of the above subproblem 
satisfies R k y k = t k . Instead of solving for y k , MINRES solves R k D]^ — V k by 
forward substitution, obtaining the last column d k of D k at iteration k. At the 
same time, it updates x k e K. k (A, b) (see Table I for definition) via x = and 

Xk = V k y k = D k R k y k = D k t k = x k _ 1 + T k d k , r k ee el t k , (9) 
where one can show using V k = D k R k that d k — (v k — 5^d k -i — e k d k ^ 2 )/"f^ ■ 
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2.3 MINRES-QLP Phase 

The MINRES phase transfers to the MINRES-QLP phase when an estimate of the 
condition number of A exceeds an input parameter trancond. Thus, trancond > 1/e 
leads to MINRES iterates throughout (where e w 1CT 16 denotes the floating-point 
precision), whereas trancond = 1 generates MINRES-QLP iterates from the start. 

Suppose for now that there is no MINRES phase. Then MINRES-QLP applies left 
reflections as in (7) and a further series of right reflections to transform Rk to a 
lower triangular matrix Lk — RkPk, where 



Pk — P\.2 P\fiPlfi 

Ik -3 



Pk-2,k — 



Cfc2 S fc2 
1 

Sfc2 -Cfc 2 



-Pfc-2.fe-Pfe-l.fe, 



fc-l,fc — 



Cfc3 Sfc3 
Sk3 -Cfc3 



In the fcth step, the actions of Pk-2,k and Pk-i,k are compactly described by 



7fc-2 



' (6) 
7fc-2 



i) 



(2) „(4) 
fe-1 'fc-1 



L m 



£fc 

7f 



4 3) 
7i 3) 





~Cfc2 

1 


Sfc2~ 




_Sfc2 


-Cfc2. 




"1 






Cfe3 


Sfc3 




. S k 3 


-c fe3 _ 



Cfc3 Sk3 
Sfc3 "Cfe3_ 



' (6) 
7fc-2 



(2) 
fc-1 

m 



(5) 

7fc-i 



7 fc 



(4) 



The fcth approximate solution to (3) is then defined to be x k = V k y k 
WkUk, where Uk solves the subproblem 



(10) 



VfcPfcMfc = 



Uk = argmin ||u|| s.t. 



u £ arg mm 

«eR fc 





Lk 


u — 


tk 









4>k_ 





(11) 



For k < i, Rk and Lk are nonsingular because Tk has full column rank by Lemma 2.1 
below. It is only when k — i and b ^ range(A) that Rk and Lk are singular with rank 
I — 1 by Theorem 2.2, in which case one can show that % = 7^ = = 7^ = 
in (10) and Li = [ Lt ~ 1 °] with nonsingular. In any case, we need to solve only 
the nonsingular lower triangular systems L k u k = t k or Li_iu e _i = Then, u fe 

and j/fe = PfeUfe are the min-length solutions of (11) and (8), respectively. 

MINRES-QLP updates Xk-2 to obtain Xk by short-recurrence orthogonal steps: 



c (2) 

L fc-2 
Xk 



„( 2 ) 



,0) „„(4) 



„( 2 ) 



Z fe 7_ 3 + Mfe-2 W fc-2' whcrC 4-3 
(2) (2) (3) . (2) 



W (4) u (3) 

W k-3 U k-3> 



(12) 
(13) 

Here Wj refers to the jth column of Wk = VkPk, and /ij is the ith element of Uk- 

If this phase is preceded by a MINRES phase of k iterations (0 < k < £), it 
starts by transferring the last three vectors dk-2, dk-i, dk to Wfc-2, Wk~i, Wk, 

and the solution estimate Xk from (9) to x k 2 }_ 2 in (12). This needs the last two 

(2) 

rows of LfeUfc = tk (to give Hk-\, ^k) and the relations Wk = DkLk and x k _ 2 = 
Xk — /ifc-iiun — HkWk- The cheaply available right reflections Pk and the bottom 
right 3x3 submatrix of L k (i.e., the last term in (10)) need to have been saved in 
the MINRES phase in order to facilitate the transfer. 
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2.4 Norm Estimates and Stopping Conditions 

Short-term recurrences are used to estimate the following quantities: 





* 4>k = 


4>k-lSk, 


00 


= \\b\\ 


(A \) 


\\Ar k \\ f 


zipk = 


<t>k\\[lk+i 5 k+2 ] ||, 






(A = o) 


Il4 2) lh 


- ( 2 ) 
° Xfc-2 


= ii[xi 2 2 3 Am. 


X-2 = X-1 


= 




\\ x k\\ ? 


a Xfc = 


||[& A Wblll, 


Xo 


= 


(x, = 11^11) 


\\Ax k \\ , 


a Wfe = 


||[Wfe_l T-fe] || , 




= 






a A k = 


: max{A-i, ||Tfce fc ||,7 fe } , 


A 


= 




cond(A) ? 


a Kfc = 


A/7 fc , 


Kq 


= 1 


(« fc /> cond(A)) 



where j k and 7 fe are the largest and smallest absolute values of diagonals of L k , 
respectively. The up (down) arrows in parentheses indicate that the quantities are 
monotonic increasing (decreasing) if such properties exist. The last two estimates 
tend to their targets from below; see [Choi 2006; Choi et al. 2011] for derivation. 

MINRES-QLP has 14 possible stopping conditions in five classes that use the 
above estimates and optional user-input parameters itnlim, rtol, Acondlim, and 
maxxnorm: 

(CI) From Lanczos and the QLP factorization: 

k = itnlim; j3 k+ i < e; |7^| < e ! 
(C2) Normwise relative backward errors (NRBE) [Paige and Strakos 2002]: 

IH|/(|H|||zfc|| + H&H) < max(rtoZ,e); \\Ar k \\/ (p||||r fc ||) < max(rto;,e); 
(C3) Regularization attempts: 

cond(A) > min( Acondlim, 0.1 /e); ||a;fe|| > maxxnorm; 
(C4) Degenerate cases: 

(3i = => 6=0 => x = is the solution; 
/3 2 = => v 2 = => Ab = aib, 

i.e., b and ot\ are an eigenpair of A, and x = b/ot\ solves Ax = b; 

(C5) Erroneous inputs: 

A not symmetric; M not symmetric; M not positive definite; 

where M is a preconditioncr to be described in the next section. For symmetry 
of A, it is not practical to check ef Aej = ejAei for alH, j — 1, . . . , n. Instead, 
we statistically test whether z — \x T (Ay) — y T (Ax)\ is sufficiently small for 
two nonzero n- vectors x and y (e.g., each element in the vectors is drawn from 
the standard normal distribution). For positive definitcness of M, since M is 
positive definite if and only if M _1 is positive definite, we simply test that 
zJ,M~ x z k = z\q k > each iteration (see Section 3). 
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We find that the recurrence relations for 4>k and ^fc hold to high accuracy. Thus 
Xk is an acceptable solution of (3) if the computed value of <fik or tpk is suitably 
small according to the NRBE tests in class (C2) above. When a condition in (C3) 
is met, the final Xk may or may not be an acceptable solution. 

The class (CI) tests for small flk+i and 7^ are included in the unlikely case in 
practice that the theoretical Lanczos termination occurs. Ideally one of the NRBE 
tests should cause MINRES-QLP to terminate. If not, it is an indication that the 
problem is very ill-conditioned, in which case the regularization and preconditioning 
techniques of Sections 1.2 and 3 may be helpful. 

2.5 Two Theorems 

We complete this section by presenting two theorems from [Choi et al. 2011] with 
slightly simpler proofs. 

Lemma 2.1. rank(T fe ) = k for all k < I. 

PROOF. For fc < I we have Pi,... , Pk+i > by definition. Hence T\ has full 
column rank. □ 

Theorem 2.2. Tg is nonsingular if and only if b <E range(A). Furthermore, 
rank(T» =1-1 ifb£ rangc(A). 

Proof. We use AVe = V{Tg twice. First, if Tg is nonsingular, we can solve 
Teye = fii&i and then AVeye = VeTeyi = VefSiei = b. Conversely, if & € range(^4), 
then range(T4) - rangc(A). Suppose Tg is singular. Then there exists z ^ such 
that VgTgz = AVez = 0. That is, 7^ Vgz e null(yl). But this is impossible because 
V11Z G range(A) and null(A) n range(I4) = 0. Thus, Tg must be nonsingular. 

We have shown that if b ^ rangc(j4), Tg = Te-i ^ e ^~ 1 is singular, and therefore 



I > rank(T £ ) > rank(7>_i) = I - 1 by Lemma 2.1. Therefore, rank(T^) =1-1. □ 

By Lemma 2.1 and Theorem 2.2 we are assured that the QLP decomposition 
without column pivoting [Stewart 1999; Choi et al. 2011] for Tk is rank-revealing, 
which is a necessary precondition for solving a least-squares problem. 

Theorem 2.3. In MINRES-QLP, x e is the minimum-length solution of (3). 

PROOF. y t comes from the min-length LS solution of T e y e w (3iei and thus 
satisfies the normal equation Tjyi = Tifiiei and ye G ranged). Now xe = Veye 
and Ax e = AV e y e = V e T e y e . Hence A 2 x e = AV e Tey e = V t T 2 y t = V e T e f3iei = Ab. 
Thus X£ is an LS solution of (3). Since ye € rangc(Tf), ye = Tez for some z, and so 
xe = Veyi = VeTez = AVez <G range(A) is the min-length LS solution of (3). □ 

3. PRECONDITIONING 

Iterative methods can be accelerated if preconditioners are available and well- 
chosen. For MINRES-QLP, we want to choose a symmetric positive-definite ma- 
trix M to solve a nonsingular system (1) by implicitly solving an equivalent sym- 
metric consistent system M~^AM~^x = b, where M?x = x, b = M~*b, and 
cond(M~5 AM^s ) <g; cond(^4). This two-sided preconditioning preserves symme- 
try. Thus we can derive preconditioned MINRES-QLP by applying MINRES-QLP to 
the equivalent problem and obtain x = M~zx. 
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With preconditioned MINRES-QLP, we can solve a singular consistent system (2), 
but we will obtain a least-squares solution that is not necessarily the minimum- 
length solution (unless M = I). For inconsistent systems (3), preconditioning 
alters the least-squares norm to || • Hm- 1 ; an d the solution is of minimum length in 
the new norm space. We refer readers to [Choi et al. 2011, Section 7] for a detailed 
discussion of various approaches to preserving the two-norm "minimum length." 

To derive MINRES-QLP, we define 

z k = f3 k M^v k , q k = /3 k M~h k , so that Mq k = z k . (14) 



Then (3 k = \\(3 k v k \\ = ||M - 5z fe || = ||zfc|| M -i = \\lk\\M = y Q k z k, where the square 
root is well defined because M is positive definite, and the following expressions 
replace the quantities in (5) in the Lanczos iterations: 

A 1 T 1 a k fik / 1 ki 

p k =Aq k -aq k , a k = ~^q k Pk, z k+1 = —p k ~ —z k - - z k -\. (15) 

P k Pk Pk Pk-i 

We also need to solve the system Mq k = z k in (14) at each iteration. 

In the MINRES phase, we define d k = M~^d k and update the solution of the 
original problem (1) by 

dk = (^-<Zfe - 4 2)( 4-i - efcdfc-2) / l k \ Xk = M~^x k = x fe _! + r k d k . 

In the MINRES-QLP phase, we define W k = M~^W k = (M-^V k )P k and update 
the solution estimate of problem (1) by orthogonal steps: 

w k = -{c k2 /l3 k )q k + s k2 w k %, w k % = (s k2 //3 k )q k + c k2 w k %, 

_(2) _(2) _ _(3) -(2) . 

W k = Sfe3^fe_l - c k3 w k , w k J_ 1 = c k3 w k ^ 1 + S k3 W k , 

(2) _ (2) , (3) -(4) _ (2) (2) _(3) -(2) 

x k-2 — x k-3 ' Vk-2 w k-2i x k — x k - 2 + Mfe-l w fe-l + M/c^fe • 

Let ffe = b — M~^AM~ix k — M~^r k . Then x k — M~^x k is an acceptable 
solution of (1) if the computed value of <f> k w ||ffe|| = 1 1 Tfc 1 1 1 is sufficiently small. 

We can now present our pseudocode in Algorithm 1. The reflectors are imple- 
mented in Algorithm 2 SymOrtho(a, b) for real a and b, which is a stable form for 
computing r = \J a 1 + b 2 > , c = - , and s — - . The complexity is at most 6 flops 
and a square root. Algorithm 1 lists all steps of MINRES-QLP with precondition- 
ing. For simplicity, w k is written as w k for all relevant k. Also, the output x solves 
Ax w b, but other outputs are associated with the preconditioned system. 

4. KEY FORTRAN 90 DESIGN FEATURES 

Our FORTRAN 90 package contains the following files for symmetric problems with 
the first three files forming the core. Their dependencies are depicted in Figure 1. 

1. minresqlpDataModule.f 90: defines precision and constants used in other mod- 
ules 

2. minresqlpBlasModule.f90: packages some BLAS functions [Burkardt] 

3. minresqlpModule . f 90: implements MINRES-QLP with preconditioning 
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Algorithm 1: Pseudocode of preconditioned MINRES-QLP for solving 
(A — al)x « b. In the right-justified comments, A = M~5 (A — aI)M~2 . 



input: A,b,a,M 

1 zo = 0, zi = b, Solve Mqi = zi, Pi = 

2 TO = TO_i = 0, X-2 — X-l = xo = 

3 Co,l=C ,2=Co,3 = — 1, S0,1=S0,2=S0,3=0, 

4 K = 1, A) = Si = 7-1 = 7() = 77-1 = 770 = 

5 fc = 

6 while no stopping condition is satisfied do 

7 



[Initialize] 



7b=w = X-2 = X-i : 
r?i = #-i = #o = fii 



= Xo = 
: M-i = Mo 



[Preconditioned Lanczos] 



\p k ~^z k - 



-Zk-l 



k^k+1 
p k = Aq k - 

Zk+1 - ~ Jk' 

Solve Mq k +i = z k +i, Pk+i = yj q k+1 z k +i 
if fc = 1 then p k = ||[a fc /3fc+i]|| else p fc = || [/3/t a* P k +i]\\ 

(2) 

^fc ~ Cfe_i.i(5fe + Sk-i,iOtk [Previous left reflection. . .] 

7fc = Sk-i,iSk — Cfe-i,ia* [on middle two entries of T k e k ■ • •] 

£fe+i = Sfc-i^i/Sfe+i [produces first two entries in T k +ie k +il 

Sk+l = —Ck-l,lPk+l 

SymOrtho(7 fc ,/? fc+ i) 



(2) 

Cki,Ski,T k ' 

(6) 

Cfe2,Sfc2,7fc-2 
5 (3) 



SymOrtho(7fJ 2 ,e fc ) 


(2) 



V k-1 

(5) 

Cfc3,«fc3,7fc-1 



Sk2&k-l 

= Cfc 2 ^fc-1 + Sfc2<5^ 



Cfe24 2) , 



(3) 

Tk 



SymOrtho(7i 4 J 1 ,4 3) ) 



(3) 
S*37fc , 



(4) 

7fc 



(3) 
-Cfe37fe 



Tk = Cfei0 fc _l 

4>k = Ski<j>k-i, i>k-i = 0fc-i||[7fe <5fc+i] || 
if fc = 1 then 7 min = 71 else 7 min «- min {7„ 

■Afc = max {A-i,Pfc, 7f_2 >7i 5 -i, l7fc 4) |} 

Wfc = ||[Wfc_l Tfc]||, K k A/7min 

Wfe = -{Ck2/Pk)q k + s k 2wf}_ 2 



[Current left reflection] 
[First right reflection] 

(2) (2) 
"Cfc27fc > »7fc = s k2T k 

[Second right reflection...] 
[to zero out 8^1 
[Last element of tkl 
[Update ||f fc ||, ||iffc-i||] 
>7fc_2'7fc-i> \l k 1} 

[Update \\A\\} 
[Update ||Ar fc ||, cond(,4)] 
[Update w fc _2, ltfk-i, Wfc] 



(4) 
fc-2 



if fc > 2 then wf ' 

,(3) 



(Sk2/Pk)qk + c k 2W k 3 l 



(2) 



Cfc3 w fe, 

if fc > 2 then /j^ 2 = ( r k-2 - »7k-2/4-4 
if fc > 1 then ^\ = (r k -i - ^-1^-3 



(3) (2) 
•J k -1 = C k3 w k-1 + Sfc 3 Wfe 



#k-2Hk%)h { k % 

d {2) u (:i) W 5) 

"k-l^k-2)l l k -l 



.(3) 



,(2) 



,/4) 



[Update /J fe _ 2 ] 
[Update fik-il 

if 7^ 4) then fx k = {r k - f/fc/4-2 ~ ^k^k-i)/^ else Mfc = [Compute /_t fc ] 

[Update Xfe_2] 
[Compute Xfc] 
[Update ||a;fc_2||] 
[Compute ||xfe||] 



(2) _ (2) (3) (3) 

X k-2 — x k-3 ' L i k-2 W k-2 

(2) , (2) (3) (2) 
X k = x k-2 + lA-lK-1 + VkW k 

(2) 
X k -2 

Xk = 



= IIM£ S 4-2111 
ir v (2) ,,(2) , 
|[Xfc-2 fJ-k-1 MfcJ 



37 I = = 0fc, ^ = 0fc||[7fc+l 5fc+2]||, X = Xfc. -4 = Ak, 

output: x, <j>, if), x, A, k, u> 



UJ — UJk 
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Algorithm 2: Algorithm SymOrtho. 
input: a,b 

1 if b = then s = 0, r — \a\ 

2 if a = then c = 1 else c = sign(a) 

3 else if a = then 

4 L c = 0, s = sign(6), r = \b\ 

s else if |6| > |a| then 

6 |_ t = a/b, s — sign(6)/\/i + r 2 , c = sr, r = b/s 

7 else if a| > |b| then 

s |^ T = b/a, c = sign(a)/Vl + r 2 , s = cr, r — a/c 
output: c, s,r 



| minresqlpTestProgram.f90 I >\ minresqlpTestModule.f90 \ ' zminresqlpTestModule.f90 k [zminresqlpTestProg 




minresqlpModule.f90 
minresqlpBlasModule.f90 




| minresqlpP.eadMtxModule.f90 j zminresqlpBlasModule.f90 



_ioModule.f90 I 



minresqlpDataModule.f90 



zminresqlpDataModule.f90 



Fig. 1. FORTRAN 90 source files and their dependencies. Filenames boxed in broken lines are 
optional, and the corresponding files are used mainly for testing and demonstration. 

4. mm_ioModule . f 90 and minresqlpReadMtxModule . f 90: packages subroutines for 
reading Matrix Market files [Matrix Market; Burkardt] 

5. minresqlpTestModule.f90: illustrates how MINRES-QLP can call Aprod or 
Msolve with a short fixed parameter list, even if it needs arbitrary other data 

6. minresqlpTestProgram.f90: contains the main driver program for unit tests 

7. Makef ile: compiles the FORTRAN source files via the Unix command make 

8. minresqlp_f 90 .README: contains information about software license, other files 
in the package, and program compilation and execution. 

The counterparts of these programs for Hermitian problems have the same filenames 
prefixed with the letter "z" . 

In our FORTRAN 90 implementation, we use modules instead of the obsolete 
FORTRAN 77 COMMON blocks for grouping programs units and data together and 
controlling their availability to other program units. A module can use public data 
and subroutines from other modules (by declaring an interface block), share its 
own public data and subroutines with other program units, and hide its own 
private data and subroutines from being used by other program units. 

In minresqlpModule . f 90 we define a public subroutine MINRESQLP that imple- 
ments Algorithm 1. Two input arguments of this subroutine, Aprod and Msolve, 
are external user-defined subroutines — we recommend they be private for data 
integrity. The subroutine Aprod defines the matrix A as an operator. For a given 
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vector x, the FORTRAN statement call Aprod(n, x, y) must return the product 
y = Ax without altering x. The subroutine Msolve is optional, and it defines a sym- 
metric positive-definite matrix as an operator M that serves as a preconditioner. 
For a given vector y, the FORTRAN statement call MSolve(n, y, x) must solve 
the linear system Mx = y without altering y. To provide the compiler the necessary 
information about these private subroutines defined in minresqlpTestModule, an 
interface block in subroutine MINRESQLP is declared, which essentially replicates 
the headers of Aprod and Msolve in minresqlpTestModule. 

A public routine minresqlptest, also defined in module minresqlpTestModule, 
calls MINRESQLP with Aprod and Msolve passed to MINRESQLP as parameters. 

We declare all data variables in minresqlpTestModule used for defining Aprod 
and Msolve to be private so that they are accessible to all the subroutines in the 
module but not outside. 

To summarize, we have described and provided a pattern that allows MINRES- 
QLP users to solve different problems by simply editing minresTestModule (and 
possibly the main program minresTestProgram, which calls minresqlptest). Users 
do not need to change MINRESQLP as long as the header of subroutines Aprod and 
Msolve stay the same in minresTestModule. 

Our design spares users from implementing reverse communication, and hence 
enables the development of iterative methods without a priori knowledge of users' 
problem data A and M (by returning control to the calling program every time 
Aprod or Msolve is to be invoked). While reverse communication is widely used 
in scientific computing with FORTRAN 77, the resulting code usually appears 
formidable and unrecognizable from the original pseudocode; see [Dongarra et al. 
1995] and [Oliveira and Stewart 2006] for two examples of CG and numerical in- 
tegration coded in FORTRAN 77 and 90, respectively. Our MINRES-QLP imple- 
mentation achieves the purpose of reverse communication while preserving code 
readability and thus maintainability. The FORTRAN 90 module structure allows a 
user's Ax products and Mx = y solves to be implemented outside MINRES-QLP in 
the same way that MATLAB's function handles operate. 

Finally, unit testing is an important software development strategy that cannot 
be overemphasized, especially in the scientific computing communities. Unit test- 
ing usually consists of multiple small and fast but specific and illuminating test 
cases that check whether the code behaves as designed. Software development is 
incremental, and errors (also known as bugs) are often found over time. Adding 
new functionalities or fixing errors often breaks the code for some earlier successful 
test cases. It is therefore critical to expand the test cases and to ensure that all 
unit tests are executed with expected results every time a program unit is updated. 

In our development of FORTRAN 90 MINRES-QLP, we have created a suite of 52 
test cases including singular matrices representative of real- world applications [Fos- 
ter 2009; Davis and Hu 2011]. The test program outputs results to MINRESQLP.txt. 
If users need to modify subroutine MINRESQLP, they can run these test cases and 
search for the word "appear" in the output file to check whether all tests are re- 
ported to be successful. For more sophisticated unit testing frameworks employed 
in large-scale scientific software development, see [O'Boyle et al. 2008]. 
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Further details on interface and implementation, with additional numerical ex- 
amples and documentation, are given in [Choi and Saunders 2012]. 

As a last note, careful choices of parameter values are critical in the convergence 
behavior of iterative solvers. While the default parameter values in MINRES-QLP 
work well in most tests, they may need to be fine-tuned in some cases by trial and 
error, solving a series of problems as in iterative regularization, or partial or full 
reorthogonalization of the Lanczos vectors. 
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